Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task

Authors

  • Junjun Wang
  • Guoyu Tang
  • Yunqing Xia
  • Qiang Zhou
  • Thomas Fang Zheng
  • Qinan Hu
  • Sen Na
  • Yaohai Huang
Abstract

Understanding the intent underlying a search query has recently attracted enormous research interest. Two challenging issues are worth noting. First, the words in a query are usually ambiguous, while the query is in most cases too short to disambiguate them. Second, in some cases the ambiguity cannot be resolved from the limited query context alone. The ambiguity therefore needs to be resolved or analyzed within context beyond the query itself. This paper presents the intent mining system developed by THCIB and THUIS, which is capable of understanding English and Chinese queries respectively, using four types of context: the query, a knowledge base, search results, and user behavior statistics. The major contributions are summarized as follows: (1) concepts extracted from the query are used to extend the query; (2) concepts are used to extract explicit subtopic candidates from Wikipedia; (3) LDA is applied to discover explicit subtopic candidates within search results; (4) sense-based subtopic clustering and entity analysis are conducted to cluster the subtopic candidates so as to discover the exclusive intents; (5) intents are ranked with a unified intent ranking model. Experimental results indicate that our intent mining method is effective.
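The clustering and ranking steps named in the abstract can be illustrated with a toy sketch. This is not the authors' system: the abstract does not specify the similarity measure, the clustering procedure, or the ranking model, so the sketch assumes a greedy grouping by word-overlap (Jaccard) similarity and a ranking by summed candidate weight. The function names, the threshold, and the example weights are all hypothetical.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (subtopic text, assumed relevance weight)


def tokens(text: str) -> set:
    """Lowercased whitespace tokens of a subtopic candidate."""
    return set(text.lower().split())


def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity; an assumed stand-in for sense-based similarity."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_candidates(candidates: List[Candidate],
                       threshold: float = 0.5) -> List[List[Candidate]]:
    """Greedily attach each candidate to the first cluster whose seed is similar enough."""
    clusters: List[List[Candidate]] = []
    for text, weight in candidates:
        for cluster in clusters:
            seed_text = cluster[0][0]
            if jaccard(tokens(text), tokens(seed_text)) >= threshold:
                cluster.append((text, weight))
                break
        else:
            # No existing cluster matched, so this candidate starts a new intent.
            clusters.append([(text, weight)])
    return clusters


def rank_intents(clusters: List[List[Candidate]]) -> List[Tuple[str, float]]:
    """Label each cluster by its heaviest member and sort by total weight."""
    scored = [
        (max(cluster, key=lambda c: c[1])[0], sum(w for _, w in cluster))
        for cluster in clusters
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidates for the ambiguous query "jaguar".
example = [
    ("jaguar car price", 0.4),
    ("jaguar car review", 0.3),
    ("jaguar animal habitat", 0.5),
]
ranked = rank_intents(cluster_candidates(example))
```

In this example the two car-related candidates share two of four distinct words, so they fall into one cluster, while the animal candidate forms its own; the car intent then ranks first because its members' weights add up to more than the single animal candidate.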


Similar resources

Overview of the NTCIR-12 IMine-2 Task

In this paper, we provide an overview of the NTCIR-12 IMine-2 task, which is a core task of NTCIR-12 and also a succeeding work of IMine@NTCIR-11, INTENT-2@NTCIR-10, and INTENT@NTCIR-9 tasks. IMine-2 comprises the Query Understanding subtask and the Vertical Incorporating subtask. 23 groups from diverse countries including China, France, India, Portugal, Ireland, and Japan registered to the tas...


KDEIM at NTCIR-12 IMine-2 Search Intent Mining Task: Query Understanding through Diversified Ranking of Subtopics

In this paper, we describe our participation in the Query Understanding subtask of the NTCIR-12 IMINE Task. We propose a method that extracts subtopics by leveraging the query suggestions from search engines. The importance of the subtopics with the query is estimated by exploiting multiple query-dependent and query-independent features with supervised feature selection. To diversify the subtop...


THUIR at NTCIR-12 IMine Task

In this paper, we describe our approaches in the NTCIR-12 IMine task, including Chinese Query Understanding and Chinese Vertical Incorporating. In the Query Understanding subtask, we propose different strategies to mine subtopic candidates from a wide range of resources and present a two-step method to predict the vertical intent for each subtopic. In the Vertical Incorporating subtask, we adopt a probab...


Mining Search Subtopics from Query Logs

Web queries are usually short and ambiguous. Subtopic mining plays an important role in understanding user’s search intent and has attracted many researchers' attention. In this paper, we describe our approach to identify users’ intents from query logs, which is a subtopic mining subtask of the NTCIR-9 Intent task for Chinese. We extract queries that are semantically related to the original que...


TUTA1 at the NTCIR-12 Temporalia Task

Our group participated in the Temporal Intent Disambiguation (TID) subtask (Chinese) of NTCIR-12. We use word2vec to model each query string as a feature vector and use cosine similarity to measure the similarity between the query string and the training corpus SougouCA. Our results show that the approach is effective for this task.



Publication date: 2013